
Conversation

@Ninja91
Contributor

@Ninja91 Ninja91 commented Aug 29, 2025

Stack from ghstack (oldest at bottom):

Add 16A8W quantization support and comprehensive tests for the add operation in ExecutorTorch ARM backend targeting Ethos U55 and U85 NPUs.

This follows the pattern established for linear operations, extending int16 support to add operations with hardware-specific testing.

Changes:

  • Add INT16 dtype validation support in op_add.py (see the sketch below)
  • Add test_add_tensor_16a8w_tosa_INT test function with U55/U85 pipeline support
  • Add U55 and U85 specific 16A8W tests with proper xfail decorators
  • Fix U55/U85 test parameter usage (remove unsupported tosa_extensions, clean quantizer function calls)
  • Update xfail reasons to consistent 'Vela compilation fails with Invalid arguments' pattern
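
For reviewers who want a concrete picture, the INT16 change boils down to accepting int16 tensors in the add operator's dtype check. A minimal sketch, assuming a standalone helper rather than the actual op_add.py visitor structure (the function name and exact dtype set below are illustrative, not the diff itself):

```python
# Illustrative only: the helper name and structure are assumptions, not the
# actual op_add.py change. The point is that int16 is now accepted alongside
# the previously supported dtypes.
import torch

_SUPPORTED_ADD_DTYPES = (torch.int8, torch.int16, torch.int32, torch.float32)


def validate_add_dtypes(*tensors: torch.Tensor) -> None:
    """Reject inputs whose dtype the TOSA add lowering does not support."""
    for t in tensors:
        if t.dtype not in _SUPPORTED_ADD_DTYPES:
            raise ValueError(
                f"add: unsupported dtype {t.dtype}; expected one of {_SUPPORTED_ADD_DTYPES}"
            )
```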

Differential Revision: D80510463

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Add 16A8W quantization support and test for the add operation in ExecutorTorch ARM backend.

This follows the pattern established for linear operations, extending int16 support to add operations.

Changes:
- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function
- Enable test_add.py in test targets configuration

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency.
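
As a rough illustration of that configuration, here is a minimal sketch of an activation/weight spec pair built on the PT2E quantizer primitives; the observer choices and value ranges are assumptions for illustration, not the ARM backend's actual a16w8 config:

```python
# Hypothetical 16A8W spec pair: 16-bit symmetric activations, 8-bit symmetric
# weights. Observer choices and quantization ranges are illustrative assumptions.
import torch
from torch.ao.quantization.observer import MinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

activation_spec_16bit = QuantizationSpec(
    dtype=torch.int16,
    quant_min=-(2**15),
    quant_max=2**15 - 1,
    qscheme=torch.per_tensor_symmetric,
    is_dynamic=False,
    observer_or_fake_quant_ctr=MinMaxObserver.with_args(eps=2**-12),
)

weight_spec_8bit = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_tensor_symmetric,
    is_dynamic=False,
    observer_or_fake_quant_ctr=MinMaxObserver.with_args(eps=2**-12),
)
```

Whether a given Ethos-U target accepts such specs is still up to the backend's own operator support checks; this only spells out the activation/weight asymmetry that the name "16A8W" refers to.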

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)

[ghstack-poisoned]
@Ninja91 Ninja91 requested a review from digantdesai as a code owner August 29, 2025 04:20
@pytorch-bot

pytorch-bot bot commented Aug 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13789

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job

As of commit 3dbf93f with merge base 1a7441f:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Ninja91 added a commit that referenced this pull request Aug 29, 2025
ghstack-source-id: 305897355
Pull Request resolved: #13789
@meta-cla meta-cla bot added the CLA Signed label Aug 29, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D80510463

Ninja91 added a commit that referenced this pull request Aug 29, 2025
Pull Request resolved: #13789

Add 16A8W quantization support and comprehensive tests for the add operation in ExecutorTorch ARM backend targeting Ethos U55 and U85 NPUs.

This follows the pattern established for linear operations, extending int16 support to add operations with hardware-specific testing.

Changes:
- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function with U55/U85 pipeline support
- Add U55 and U85 specific 16A8W tests with proper xfail decorators (see the test sketch below)
- Fix U55/U85 test parameter usage (remove unsupported tosa_extensions, clean quantizer function calls)
- Update xfail reasons to consistent 'Vela compilation fails with Invalid arguments' pattern
- Remove redundant u55_config parameter from get_symmetric_a16w8_add_quantizer function
- Enable test_add.py in test targets configuration for both fbcode and xplat

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency on ARM Ethos NPUs.
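
For context, a hardware-specific 16A8W test of this kind looks roughly like the sketch below. The pipeline helper, its arguments, and the target strings are hypothetical placeholders standing in for the Arm test utilities; only the xfail reason string and the a16w8 intent come from this change.

```python
# Simplified sketch of a U55/U85 16A8W add test; run_ethos_u_pipeline is a
# hypothetical placeholder for the Arm backend test pipeline.
import pytest
import torch


class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y


def run_ethos_u_pipeline(module, inputs, target, a16w8):
    """Placeholder: the real pipeline quantizes (16-bit activations, 8-bit
    weights), lowers through the Arm backend for `target`, compiles with Vela,
    and compares against the eager-mode output."""
    raise NotImplementedError


@pytest.mark.parametrize("target", ["ethos-u55-128", "ethos-u85-128"])
@pytest.mark.xfail(reason="Vela compilation fails with Invalid arguments")
def test_add_tensor_16a8w_ethos_u(target):
    inputs = (torch.randn(1, 8), torch.randn(1, 8))
    run_ethos_u_pipeline(Add(), inputs, target=target, a16w8=True)
```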
ghstack-source-id: 306430209
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Aug 29, 2025
Pull Request resolved: #13789

ghstack-source-id: 306430209
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Aug 29, 2025
Pull Request resolved: #13789

Add 16A8W quantization support and comprehensive tests for the add operation in ExecutorTorch ARM backend targeting Ethos U55 and U85 NPUs.

This follows the pattern established for linear operations, extending int16 support to add operations with hardware-specific testing.

Changes:
- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function with U55/U85 pipeline support
- Add U55 and U85 specific 16A8W tests with proper xfail decorators
- Fix U55/U85 test parameter usage (remove unsupported tosa_extensions, clean quantizer function calls)
- Update xfail reasons to consistent 'Vela compilation fails with Invalid arguments' pattern

ghstack-source-id: 306430970
ghstack-source-id: 306430970
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)

@Ninja91
Contributor Author

Ninja91 commented Aug 29, 2025

Closed #13653 as it's covered in this PR.

Ninja91 added a commit that referenced this pull request Sep 4, 2025
Pull Request resolved: #13789

ghstack-source-id: 306434516
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 4, 2025
Pull Request resolved: #13789

ghstack-source-id: 307540287
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
@Ninja91 Ninja91 added the release notes: arm label Sep 5, 2025


Ninja91 added a commit that referenced this pull request Sep 6, 2025
Pull Request resolved: #13789

ghstack-source-id: 308024224
@exported-using-ghexport

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 7, 2025
Pull Request resolved: #13789

ghstack-source-id: 308046738
@exported-using-ghexport
@bypass-github-pytorch-ci-checks

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 7, 2025
Pull Request resolved: #13789

ghstack-source-id: 308052889
@exported-using-ghexport
@bypass-github-pytorch-ci-checks

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
Ninja91 added a commit that referenced this pull request Sep 7, 2025
Pull Request resolved: #13789

ghstack-source-id: 308053642
@exported-using-ghexport
@bypass-github-pytorch-ci-checks
@bypass-github-executorch-ci-checks

Differential Revision: [D80510463](https://our.internmc.facebook.com/intern/diff/D80510463/)
@facebook-github-bot facebook-github-bot merged commit 66e38a9 into gh/Ninja91/5/base Sep 7, 2025
290 of 294 checks passed
@facebook-github-bot facebook-github-bot deleted the gh/Ninja91/5/head branch September 7, 2025 06:41
Ninja91 added a commit that referenced this pull request Sep 8, 2025
This PR was created by the merge bot to help merge the original PR into
the main branch.
ghstack PR number: #13789 by
@Ninja91
^ Please use this as the source of truth for the PR details, comments,
and reviews
ghstack PR base:
https://github.com/pytorch/executorch/tree/gh/Ninja91/5/base
ghstack PR head:
https://github.com/pytorch/executorch/tree/gh/Ninja91/5/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head:
https://github.com/pytorch/executorch/tree/gh/Ninja91/5/orig
@diff-train-skip-merge

Co-authored-by: Nitin Jain <[email protected]>

Labels

ciflow/trunk · CLA Signed · fb-exported · module: arm · partner: arm · release notes: arm
